🎭 AI Battle: DeepSeek vs ChatGPT - A Comprehensive Analysis

🔍 Which AI Performs Better? Let’s Uncover the Truth!


📖 The Story Behind This Analysis

🤖 "In the fast-paced world of AI, two titans battle for dominance: DeepSeek and ChatGPT. Each promises unparalleled intelligence, lightning-fast responses, and human-like understanding. But the real question remains...

Which AI truly delivers the best experience?"

Imagine a world where AI assistants handle billions of queries daily. Some users rave about their accuracy, while others complain about hallucinations and slow responses. Who should we trust?

This project dives deep into real-world user interactions to uncover the strengths and weaknesses of both AI models.


📊 What This Notebook Covers

📌 Exploratory Data Analysis (EDA) – Uncover hidden trends and insights 🔍
📌 Interactive Visualizations – Compare AI performance dynamically 📈
📌 User-Based Filtering & Search – Analyze engagement and preferences 🎯
📌 Outlier Detection & Handling – Clean and refine the data for accuracy 🛠️
📌 Session Tracking – Understand long-term user behavior and AI adoption 🕵️

🚀 By the end of this notebook, you'll have a data-driven understanding of how DeepSeek and ChatGPT compare across multiple dimensions!


🎬 Let the AI Battle Begin! 🔥

In [1]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))
In [2]:
import seaborn as sns
import matplotlib.pyplot as plt
In [3]:
df = pd.read_csv('deepseek_vs_chatgpt.csv')
In [4]:
df
Out[4]:
Date Month_Num Weekday AI_Platform AI_Model_Version Active_Users New_Users Churned_Users Daily_Churn_Rate Retention_Rate ... Session_Duration_sec Device_Type Language Response_Accuracy Response_Speed_sec Response_Time_Category Correction_Needed User_Return_Frequency Customer_Support_Interactions Region
0 2024-09-21 9 Saturday ChatGPT GPT-4-turbo 500000 25000 25000 0.05 0.95 ... 40 Mobile es 0.7842 3.30 Standard 0 6 2 Antarctica (the territory South of 60 deg S)
1 2024-09-21 9 Saturday ChatGPT GPT-4-turbo 500000 25000 25000 0.05 0.95 ... 24 Laptop/Desktop zh 0.8194 3.28 Standard 1 2 2 Ukraine
2 2024-09-21 9 Saturday ChatGPT GPT-4-turbo 500000 25000 25000 0.05 0.95 ... 34 Mobile en 0.8090 3.07 Standard 0 2 0 Grenada
3 2024-09-21 9 Saturday ChatGPT GPT-4-turbo 500000 25000 25000 0.05 0.95 ... 18 Mobile fr 0.8233 3.06 Standard 0 9 0 Guyana
4 2024-05-16 5 Thursday DeepSeek DeepSeek-Chat 1.5 1700000 170000 34000 0.02 0.95 ... 10 Mobile de 0.9366 1.48 Fast 0 9 3 India
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
9995 2024-05-17 5 Friday DeepSeek DeepSeek-Chat 1.5 1700000 170000 34000 0.02 0.95 ... 34 Laptop/Desktop zh 0.9791 0.60 Instant 0 7 2 Seychelles
9996 2024-05-17 5 Friday DeepSeek DeepSeek-Chat 1.5 1700000 170000 34000 0.02 0.95 ... 19 Laptop/Desktop en 0.9132 0.83 Instant 0 5 0 Christmas Island
9997 2024-05-17 5 Friday DeepSeek DeepSeek-Chat 1.5 1700000 170000 34000 0.02 0.95 ... 29 Laptop/Desktop de 0.9516 0.94 Instant 0 10 2 Ethiopia
9998 2024-05-17 5 Friday DeepSeek DeepSeek-Chat 1.5 1700000 170000 34000 0.02 0.95 ... 21 Mobile de 0.9359 0.83 Instant 0 5 3 Puerto Rico
9999 2024-05-17 5 Friday DeepSeek DeepSeek-Chat 1.5 1700000 170000 34000 0.02 0.95 ... 58 Mobile fr 0.9399 0.76 Instant 1 7 1 Kyrgyz Republic

10000 rows × 28 columns

In [5]:
df.columns
Out[5]:
Index(['Date', 'Month_Num', 'Weekday', 'AI_Platform', 'AI_Model_Version',
       'Active_Users', 'New_Users', 'Churned_Users', 'Daily_Churn_Rate',
       'Retention_Rate', 'User_ID', 'Query_Type', 'Input_Text',
       'Input_Text_Length', 'Response_Tokens', 'Topic_Category', 'User_Rating',
       'User_Experience_Score', 'Session_Duration_sec', 'Device_Type',
       'Language', 'Response_Accuracy', 'Response_Speed_sec',
       'Response_Time_Category', 'Correction_Needed', 'User_Return_Frequency',
       'Customer_Support_Interactions', 'Region'],
      dtype='object')
In [6]:
df.describe()
Out[6]:
Month_Num Active_Users New_Users Churned_Users Daily_Churn_Rate Retention_Rate Input_Text_Length Response_Tokens User_Rating User_Experience_Score Session_Duration_sec Response_Accuracy Response_Speed_sec Correction_Needed User_Return_Frequency Customer_Support_Interactions
count 10000.000000 1.000000e+04 10000.000000 10000.000000 10000.000000 1.000000e+04 10000.000000 10000.000000 10000.000000 10000.000000 10000.000000 9621.000000 10000.000000 10000.000000 10000.000000 10000.000000
mean 7.128900 1.196255e+06 100508.750000 35395.150000 0.035228 9.500000e-01 6.260700 274.765100 4.394700 1.626706 28.533700 0.850287 2.356651 0.144600 5.530600 1.476800
std 3.559712 7.444465e+05 85584.077151 14849.189585 0.014999 1.054765e-14 1.188561 130.077225 0.734551 0.491296 14.090348 0.072755 1.303743 0.351715 2.867906 1.120887
min 1.000000 2.000000e+05 12500.000000 4000.000000 0.020000 9.500000e-01 4.000000 50.000000 3.000000 0.480000 5.000000 0.654200 0.330000 0.000000 1.000000 0.000000
25% 4.000000 6.500000e+05 35000.000000 25000.000000 0.020000 9.500000e-01 6.000000 162.000000 4.000000 1.230000 17.000000 0.801800 1.250000 0.000000 3.000000 0.000000
50% 8.000000 9.500000e+05 52500.000000 35000.000000 0.050000 9.500000e-01 7.000000 276.000000 5.000000 1.710000 27.000000 0.862200 2.070000 0.000000 6.000000 1.000000
75% 10.000000 1.700000e+06 170000.000000 49000.000000 0.050000 9.500000e-01 7.000000 386.250000 5.000000 2.070000 38.000000 0.905000 3.450000 0.000000 8.000000 2.000000
max 12.000000 3.050000e+06 305000.000000 61000.000000 0.050000 9.500000e-01 8.000000 500.000000 5.000000 2.280000 60.000000 0.997200 5.190000 1.000000 10.000000 3.000000
In [7]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 28 columns):
 #   Column                         Non-Null Count  Dtype  
---  ------                         --------------  -----  
 0   Date                           10000 non-null  object 
 1   Month_Num                      10000 non-null  int64  
 2   Weekday                        10000 non-null  object 
 3   AI_Platform                    10000 non-null  object 
 4   AI_Model_Version               10000 non-null  object 
 5   Active_Users                   10000 non-null  int64  
 6   New_Users                      10000 non-null  int64  
 7   Churned_Users                  10000 non-null  int64  
 8   Daily_Churn_Rate               10000 non-null  float64
 9   Retention_Rate                 10000 non-null  float64
 10  User_ID                        10000 non-null  object 
 11  Query_Type                     10000 non-null  object 
 12  Input_Text                     10000 non-null  object 
 13  Input_Text_Length              10000 non-null  int64  
 14  Response_Tokens                10000 non-null  int64  
 15  Topic_Category                 10000 non-null  object 
 16  User_Rating                    10000 non-null  int64  
 17  User_Experience_Score          10000 non-null  float64
 18  Session_Duration_sec           10000 non-null  int64  
 19  Device_Type                    10000 non-null  object 
 20  Language                       10000 non-null  object 
 21  Response_Accuracy              9621 non-null   float64
 22  Response_Speed_sec             10000 non-null  float64
 23  Response_Time_Category         10000 non-null  object 
 24  Correction_Needed              10000 non-null  int64  
 25  User_Return_Frequency          10000 non-null  int64  
 26  Customer_Support_Interactions  10000 non-null  int64  
 27  Region                         10000 non-null  object 
dtypes: float64(5), int64(11), object(12)
memory usage: 2.1+ MB
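The `df.info()` summary lists `Date` as `object`, so the dates are still plain strings at this point. A minimal sketch of parsing them and cross-checking the precomputed `Month_Num` and `Weekday` columns might look like the following. It uses two rows copied from the preview above; the `toy` frame and the `parsed` name are illustrative only and not part of the notebook.

```python
import pandas as pd

# Two rows copied from the preview table (all other columns omitted).
toy = pd.DataFrame({
    "Date": ["2024-09-21", "2024-05-16"],
    "Month_Num": [9, 5],
    "Weekday": ["Saturday", "Thursday"],
})

# Convert the string column to datetime64 so the .dt accessors become available.
parsed = pd.to_datetime(toy["Date"], format="%Y-%m-%d")

# Compare the derived calendar fields with the stored ones.
month_ok = bool((parsed.dt.month == toy["Month_Num"]).all())
weekday_ok = bool((parsed.dt.day_name() == toy["Weekday"]).all())
print(month_ok, weekday_ok)
```

Running the same comparison on the full frame would be one way to confirm that the helper columns agree with `Date` before plotting against them.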
In [8]:
df.tail(5)
Out[8]:
Date Month_Num Weekday AI_Platform AI_Model_Version Active_Users New_Users Churned_Users Daily_Churn_Rate Retention_Rate ... Session_Duration_sec Device_Type Language Response_Accuracy Response_Speed_sec Response_Time_Category Correction_Needed User_Return_Frequency Customer_Support_Interactions Region
9995 2024-05-17 5 Friday DeepSeek DeepSeek-Chat 1.5 1700000 170000 34000 0.02 0.95 ... 34 Laptop/Desktop zh 0.9791 0.60 Instant 0 7 2 Seychelles
9996 2024-05-17 5 Friday DeepSeek DeepSeek-Chat 1.5 1700000 170000 34000 0.02 0.95 ... 19 Laptop/Desktop en 0.9132 0.83 Instant 0 5 0 Christmas Island
9997 2024-05-17 5 Friday DeepSeek DeepSeek-Chat 1.5 1700000 170000 34000 0.02 0.95 ... 29 Laptop/Desktop de 0.9516 0.94 Instant 0 10 2 Ethiopia
9998 2024-05-17 5 Friday DeepSeek DeepSeek-Chat 1.5 1700000 170000 34000 0.02 0.95 ... 21 Mobile de 0.9359 0.83 Instant 0 5 3 Puerto Rico
9999 2024-05-17 5 Friday DeepSeek DeepSeek-Chat 1.5 1700000 170000 34000 0.02 0.95 ... 58 Mobile fr 0.9399 0.76 Instant 1 7 1 Kyrgyz Republic

5 rows × 28 columns

In [9]:
df.isnull().sum()
Out[9]:
Date                               0
Month_Num                          0
Weekday                            0
AI_Platform                        0
AI_Model_Version                   0
Active_Users                       0
New_Users                          0
Churned_Users                      0
Daily_Churn_Rate                   0
Retention_Rate                     0
User_ID                            0
Query_Type                         0
Input_Text                         0
Input_Text_Length                  0
Response_Tokens                    0
Topic_Category                     0
User_Rating                        0
User_Experience_Score              0
Session_Duration_sec               0
Device_Type                        0
Language                           0
Response_Accuracy                379
Response_Speed_sec                 0
Response_Time_Category             0
Correction_Needed                  0
User_Return_Frequency              0
Customer_Support_Interactions      0
Region                             0
dtype: int64
In [10]:
# Fill missing accuracy values with the column median; assigning the result back
# avoids the chained-assignment FutureWarning raised by fillna(..., inplace=True)
df['Response_Accuracy'] = df['Response_Accuracy'].fillna(df['Response_Accuracy'].median())
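The fill step above replaces the 379 missing `Response_Accuracy` values with the column median. As a small illustration of what that does, here is a sketch on made-up numbers; the `toy` frame and `median_accuracy` name are assumptions for the example.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the notebook's DataFrame (values are made up).
toy = pd.DataFrame({"Response_Accuracy": [0.80, np.nan, 0.90, 0.70, np.nan]})

# The median is computed over the non-missing values only.
median_accuracy = toy["Response_Accuracy"].median()

# Assign the filled column back to the frame rather than calling
# fillna(..., inplace=True) on a column selection.
toy["Response_Accuracy"] = toy["Response_Accuracy"].fillna(median_accuracy)

print(median_accuracy)
print(toy["Response_Accuracy"].tolist())
```

Median imputation keeps the center of the distribution where it was but shrinks its spread slightly, which is worth keeping in mind when reading the boxplots later in the notebook.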
In [11]:
def detect_outliers_iqr(df, column):
    Q1 = df[column].quantile(0.25)
    Q3 = df[column].quantile(0.75)
    IQR = Q3 - Q1
    lower_bound = Q1 - 1.5 * IQR
    upper_bound = Q3 + 1.5 * IQR
    outliers = df[(df[column] < lower_bound) | (df[column] > upper_bound)]
    return outliers

# Columns to check for outliers
num_cols = ['Response_Accuracy', 'User_Rating', 'Session_Duration_sec', 'Response_Speed_sec']

# Detect outliers in selected numerical columns
for col in num_cols:
    outliers = detect_outliers_iqr(df, col)
    print(f"Outliers in {col}: {len(outliers)}")
Outliers in Response_Accuracy: 16
Outliers in User_Rating: 0
Outliers in Session_Duration_sec: 0
Outliers in Response_Speed_sec: 0
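To make the IQR rule used by `detect_outliers_iqr` concrete, the sketch below works through the fences on a tiny made-up series. The numbers and variable names are illustrative assumptions; only the 1.5 × IQR rule comes from the cell above.

```python
import pandas as pd

# Eight made-up values with one obvious extreme point.
values = pd.Series([10, 12, 13, 14, 15, 16, 18, 40])

# Quartiles use pandas' default linear interpolation.
q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1

# Anything outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] is flagged.
lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr
outliers = values[(values < lower_bound) | (values > upper_bound)]

print(lower_bound, upper_bound)
print(outliers.tolist())
```

Because the fences depend only on the middle 50% of the data, a column with few distinct values (such as `User_Rating`) can have wide fences relative to its range, which is consistent with the zero counts reported above.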
In [12]:
def replace_outliers(df, column, method="median"):
    Q1 = df[column].quantile(0.25)
    Q3 = df[column].quantile(0.75)
    IQR = Q3 - Q1
    lower_bound = Q1 - 1.5 * IQR
    upper_bound = Q3 + 1.5 * IQR

    if method == "median":
        replacement = df[column].median()
    else:
        replacement = df[column].mean()
    
    df[column] = np.where((df[column] < lower_bound) | (df[column] > upper_bound), replacement, df[column])

# Apply to numerical columns
for col in num_cols:
    replace_outliers(df, col, method="median")

print("Outliers replaced with median values!")
Outliers replaced with median values!
In [13]:
def remove_outliers(df, column):
    Q1 = df[column].quantile(0.25)
    Q3 = df[column].quantile(0.75)
    IQR = Q3 - Q1
    lower_bound = Q1 - 1.5 * IQR
    upper_bound = Q3 + 1.5 * IQR
    return df[(df[column] >= lower_bound) & (df[column] <= upper_bound)]

# Apply to selected numerical columns
for col in num_cols:
    df = remove_outliers(df, col)

print("Outliers removed!")
Outliers removed!
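The notebook applies two treatments in sequence: first replacing flagged values with the median, then dropping any rows that still fall outside the fences recomputed on the modified column. The sketch below contrasts the two treatments side by side on made-up data so the trade-off is visible; the `toy` frame and the `replaced` and `removed` names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Made-up response times with a single extreme value.
toy = pd.DataFrame({"Response_Speed_sec": [1.0, 1.2, 1.3, 1.4, 1.5, 1.6, 1.8, 9.0]})
col = "Response_Speed_sec"

# Same 1.5 * IQR fences as in the notebook's helper functions.
q1, q3 = toy[col].quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
is_outlier = (toy[col] < lower) | (toy[col] > upper)

# Treatment 1: keep every row, overwrite flagged values with the median.
replaced = toy.assign(**{col: np.where(is_outlier, toy[col].median(), toy[col])})

# Treatment 2: drop the flagged rows entirely.
removed = toy.loc[~is_outlier]

print(len(replaced), len(removed))
```

Replacement preserves the row count, which keeps the other columns of those rows in later plots, while removal discards them; applying both, as the notebook does, mostly leaves the second pass with little to do.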
In [14]:
plt.figure(figsize=(12, 6))
for i, col in enumerate(num_cols, 1):
    plt.subplot(2, 2, i)
    sns.boxplot(x=df[col], color='green')
    plt.title(f"Boxplot After Outlier Handling: {col}")
plt.tight_layout()
plt.show()
[Figure: Boxplots after outlier handling for Response_Accuracy, User_Rating, Session_Duration_sec, and Response_Speed_sec]
In [15]:
plt.figure(figsize=(8, 5))
sns.histplot(df['User_Rating'], bins=10, kde=True, color='blue')
plt.title("Distribution of User Ratings")
plt.xlabel("User Rating")
plt.ylabel("Frequency")
plt.show()
[Figure: Distribution of User Ratings]
In [16]:
plt.figure(figsize=(8, 5))
sns.boxplot(x=df['Response_Accuracy'], color='green')
plt.title("Boxplot of Response Accuracy")
plt.show()
[Figure: Boxplot of Response Accuracy]
In [17]:
plt.figure(figsize=(8, 5))
# Assign the x variable to hue so the palette applies without a deprecation warning
sns.barplot(data=df, x='AI_Platform', y='Active_Users', hue='AI_Platform', palette="coolwarm", legend=False)
plt.title("Active Users per AI Platform")
plt.xlabel("AI Platform")
plt.ylabel("Number of Active Users")
plt.show()
[Figure: Active Users per AI Platform]
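By default `sns.barplot` draws the mean of `y` for each category (with an error bar), so the chart above shows an average of `Active_Users` over rows rather than a total. In the preview rows, `Active_Users` repeats for every row that shares a date and platform, which means a row-level mean weights each day by how many rows it contributes. The sketch below contrasts the two averages on made-up numbers; the `toy` frame and its values are assumptions for illustration.

```python
import pandas as pd

# Made-up rows: Active_Users is constant within each (platform, date) pair.
toy = pd.DataFrame({
    "AI_Platform": ["ChatGPT"] * 4 + ["DeepSeek"] * 3,
    "Date": ["2024-09-21"] * 3 + ["2024-09-22"]
            + ["2024-05-16"] + ["2024-05-17"] * 2,
    "Active_Users": [500000] * 3 + [650000] + [1700000] + [950000] * 2,
})

# What a bar chart of the raw rows averages: one value per row.
row_mean = toy.groupby("AI_Platform")["Active_Users"].mean()

# One value per platform and day, then the mean across days.
daily = toy.drop_duplicates(["AI_Platform", "Date"])
daily_mean = daily.groupby("AI_Platform")["Active_Users"].mean()

print(row_mean)
print(daily_mean)
```

Which of the two is appropriate depends on the question being asked; the point is only that they can differ when days contribute unequal numbers of rows.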
In [18]:
plt.figure(figsize=(8, 5))
sns.scatterplot(x=df['Response_Accuracy'], y=df['User_Rating'], hue=df['AI_Platform'])
plt.title("User Rating vs Response Accuracy")
plt.xlabel("Response Accuracy")
plt.ylabel("User Rating")
plt.show()
[Figure: User Rating vs Response Accuracy]
In [19]:
plt.figure(figsize=(8, 5))
# Assign the x variable to hue so the palette applies without a deprecation warning
sns.boxplot(data=df, x='AI_Platform', y='Response_Accuracy', hue='AI_Platform', palette="Set2", legend=False)
plt.title("Response Accuracy by AI Platform")
plt.show()
[Figure: Response Accuracy by AI Platform]
In [20]:
plt.figure(figsize=(10, 5))
sns.countplot(x=df['AI_Model_Version'], hue=df['AI_Platform'], palette="Set1")
plt.xticks(rotation=45)
plt.title("AI Model Version Distribution")
plt.show()
[Figure: AI Model Version Distribution]
In [21]:
import plotly.express as px
fig = px.scatter(df, x='Response_Speed_sec', y='User_Experience_Score', color='AI_Platform',
                 title="Response Speed vs User Experience Score", size='User_Experience_Score')
fig.show()
In [22]:
fig = px.bar(df, x='Region', y='Active_Users', color='AI_Platform', 
             title="Active Users per Region", barmode="group")
fig.show()
In [23]:
from wordcloud import WordCloud, STOPWORDS

plt.figure(figsize=(12, 6))

# One word cloud per platform, laid out side by side
for i, platform in enumerate(df['AI_Platform'].unique(), 1):
    platform_text = " ".join(df[df['AI_Platform'] == platform]['Input_Text'].dropna())
    platform_wordcloud = WordCloud(width=800, height=400, background_color='black',
                                   colormap="plasma").generate(platform_text)

    plt.subplot(1, 2, i)
    plt.imshow(platform_wordcloud, interpolation="bilinear")
    plt.axis("off")
    plt.title(f"Most Searched Terms - {platform}")

plt.tight_layout()
plt.show()
[Figure: Most Searched Terms word clouds, one per AI platform]
In [24]:
def generate_wordcloud(column_name, bg_color="white", cmap="coolwarm"):
    plt.figure(figsize=(10, 5))

    # Combine all text from the column
    text = " ".join(df[column_name].dropna().astype(str))

    # Create WordCloud
    wordcloud = WordCloud(width=800, height=400, background_color=bg_color,
                          stopwords=STOPWORDS, colormap=cmap).generate(text)

    # Display WordCloud
    plt.imshow(wordcloud, interpolation="bilinear")
    plt.axis("off")
    plt.title(f"WordCloud for {column_name}", fontsize=14)
    plt.show()
In [25]:
generate_wordcloud("Query_Type")
[Figure: WordCloud for Query_Type]
In [26]:
generate_wordcloud("Topic_Category", bg_color="black", cmap="plasma")
[Figure: WordCloud for Topic_Category]
In [27]:
generate_wordcloud("Language", bg_color="white", cmap="viridis")
[Figure: WordCloud for Language]
In [28]:
generate_wordcloud("Device_Type", bg_color="white", cmap="inferno")
[Figure: WordCloud for Device_Type]
In [29]:
plt.figure(figsize=(12, 6))
sns.lineplot(data=df, x="Date", y="Active_Users", hue="AI_Platform", marker="o", linewidth=2)
plt.xticks(rotation=45)
plt.title("Active Users Trend Over Time (DeepSeek vs ChatGPT)", fontsize=14)
plt.xlabel("Date")
plt.ylabel("Active Users")
plt.legend(title="AI Platform")
plt.grid(True)
plt.show()
[Figure: Active Users Trend Over Time (DeepSeek vs ChatGPT)]
In [30]:
fig = px.bar(df, x="AI_Model_Version", y="Active_Users", color="AI_Platform",
             title="AI Model Version Usage Over Time", barmode="stack")
fig.show()
In [31]:
sns.jointplot(data=df, x="Response_Accuracy", y="User_Rating", hue="AI_Platform", kind="scatter", height=7)
plt.suptitle("User Rating vs Response Accuracy", fontsize=14)
plt.show()
[Figure: User Rating vs Response Accuracy (joint plot)]
In [32]:
plt.figure(figsize=(10, 5))
sns.heatmap(df.head(25).pivot_table(index="Session_Duration_sec", columns="User_Experience_Score", 
                           values="Active_Users", aggfunc="sum").fillna(0), cmap="coolwarm", annot=True)
plt.title("User Engagement Heatmap: Session Duration vs Experience Score")
plt.xlabel("User Experience Score")
plt.ylabel("Session Duration (sec)")
plt.show()
[Figure: User Engagement Heatmap: Session Duration vs Experience Score]
In [33]:
fig = px.choropleth(df, locations="Region", locationmode="country names", 
                     color="Active_Users", hover_name="Region",
                     title="Most Active Regions (DeepSeek vs ChatGPT)", color_continuous_scale="viridis")
fig.show()
In [34]:
fig = px.scatter(df, x="Response_Speed_sec", y="User_Experience_Score", 
                 size="User_Experience_Score", color="AI_Platform",
                 hover_data=["AI_Model_Version", "Response_Accuracy"],
                 title="Response Speed vs User Experience Score")
fig.show()
In [35]:
fig = px.sunburst(df, path=["AI_Platform", "Topic_Category", "Query_Type"], 
                  values="Active_Users", color="AI_Platform",
                  title="AI Query Distribution Across Topics")
fig.show()
In [36]:
fig, ax1 = plt.subplots(figsize=(12, 6))

# Daily churn rate on the left y-axis (stored as a fraction, e.g. 0.05)
sns.lineplot(data=df, x="Date", y="Daily_Churn_Rate", hue="AI_Platform", marker="o", ax=ax1)
ax1.set_ylabel("Daily Churn Rate (fraction)", color="red")
ax1.tick_params(axis="y", labelcolor="red")

# Retention rate on a secondary y-axis that shares the same x-axis
ax2 = ax1.twinx()
sns.lineplot(data=df, x="Date", y="Retention_Rate", hue="AI_Platform", marker="s", linestyle="dashed", ax=ax2)
ax2.set_ylabel("Retention Rate (fraction)", color="blue")
ax2.tick_params(axis="y", labelcolor="blue")

plt.title("Daily Churn Rate vs Retention Rate Over Time", fontsize=14)
plt.xticks(rotation=45)
plt.show()
[Figure: Daily Churn Rate vs Retention Rate Over Time]
In [37]:
plt.figure(figsize=(12, 6))
sns.countplot(data=df, y="Query_Type", hue="AI_Platform", order=df["Query_Type"].value_counts().index, palette="viridis")
plt.title("Most Common Query Types by AI Platform", fontsize=14)
plt.xlabel("Count")
plt.ylabel("Query Type")
plt.show()
[Figure: Most Common Query Types by AI Platform]
In [38]:
fig = px.scatter(df, x="Response_Accuracy", y="User_Experience_Score", size="Active_Users", color="Topic_Category",
                 hover_data=["AI_Platform"], title="Response Accuracy vs User Experience by Topic")
fig.show()
In [39]:
df_device = df["Device_Type"].value_counts().reset_index()
df_device.columns = ["Device_Type", "Count"]

plt.figure(figsize=(8, 8))
plt.pie(df_device["Count"], labels=df_device["Device_Type"], autopct="%1.1f%%", startangle=90, colors=["blue", "green", "orange", "red"])
plt.title("Device Type Distribution")
plt.show()
[Figure: Device Type Distribution]
In [40]:
plt.figure(figsize=(10, 5))
# Assign the x variable to hue so the palette applies without a deprecation warning
sns.boxplot(data=df, x="AI_Platform", y="Customer_Support_Interactions", hue="AI_Platform", palette="coolwarm", legend=False)
plt.title("Customer Support Interactions Across AI Platforms", fontsize=14)
plt.xlabel("AI Platform")
plt.ylabel("Support Interactions")
plt.show()
[Figure: Customer Support Interactions Across AI Platforms]
In [41]:
plt.figure(figsize=(10, 6))
heatmap_data = df.head(150).pivot_table(index="Weekday", columns="Month_Num", values="Active_Users", aggfunc="sum")
sns.heatmap(heatmap_data, cmap="coolwarm", annot=True, fmt=".0f")
plt.title("Active Users Heatmap: Weekdays vs Months", fontsize=14)
plt.xlabel("Month")
plt.ylabel("Weekday")
plt.show()
[Figure: Active Users Heatmap: Weekdays vs Months]
In [42]:
plt.figure(figsize=(10, 6))
# Assign the x variable to hue so the palette applies without a deprecation warning
sns.violinplot(data=df, x="AI_Platform", y="User_Return_Frequency", hue="AI_Platform", palette="pastel", inner="quartile", legend=False)
plt.title("User Return Frequency Across AI Platforms", fontsize=14)
plt.xlabel("AI Platform")
plt.ylabel("User Return Frequency")
plt.show()
[Figure: User Return Frequency Across AI Platforms]
In [44]:
import plotly.graph_objects as go

# Ensure 'Date' is in datetime format
df["Date"] = pd.to_datetime(df["Date"])

# Split by platform; the dataset spells the label "DeepSeek", and sorting by
# date keeps the connected line from zigzagging between unordered rows
deepseek_data = df[df["AI_Platform"].str.strip().eq("DeepSeek")].sort_values("Date")
chatgpt_data = df[df["AI_Platform"].str.strip().eq("ChatGPT")].sort_values("Date")

fig = go.Figure()

# Add DeepSeek data if available
if not deepseek_data.empty:
    fig.add_trace(go.Scatter(x=deepseek_data["Date"], 
                             y=deepseek_data["Active_Users"], 
                             mode="lines+markers", name="DeepSeek", line=dict(color="green")))

# Add ChatGPT data if available
if not chatgpt_data.empty:
    fig.add_trace(go.Scatter(x=chatgpt_data["Date"], 
                             y=chatgpt_data["Active_Users"], 
                             mode="lines+markers", name="ChatGPT", line=dict(color="red")))

fig.update_layout(title="Active Users Over Time (DeepSeek vs ChatGPT)",
                  xaxis_title="Date",
                  yaxis_title="Active Users",
                  template="plotly_dark")

fig.show()
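Filtering on an exact platform label is sensitive to capitalization: the table preview shows the value `DeepSeek`, so a comparison against a differently cased spelling matches no rows and the corresponding trace is silently skipped. One way to guard against that, sketched here under the assumption that labels may differ only in case and surrounding whitespace, is to normalize both sides before comparing. The `platform_mask` helper and the `toy` frame are hypothetical.

```python
import pandas as pd

# Made-up labels with inconsistent case and stray whitespace.
toy = pd.DataFrame({
    "AI_Platform": ["DeepSeek", " ChatGPT", "deepseek ", "ChatGPT"],
    "Active_Users": [1700000, 500000, 1700000, 500000],
})

def platform_mask(frame, name):
    """Boolean mask for rows whose platform matches `name`, ignoring case and padding."""
    labels = frame["AI_Platform"].str.strip().str.casefold()
    return labels.eq(name.casefold())

# An exact comparison against a differently cased spelling finds nothing.
exact_hits = int(toy["AI_Platform"].str.strip().eq("Deepseek").sum())

# The normalized comparison finds both DeepSeek rows.
normalized_hits = int(platform_mask(toy, "Deepseek").sum())

print(exact_hits, normalized_hits)
```

Checking `df["AI_Platform"].unique()` before filtering is a lighter alternative when the labels are known to be clean.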
In [45]:
import streamlit as st
import pandas as pd
import plotly.express as px
from datetime import datetime
import os

# **Session Tracking File**
SESSION_FILE = "user_sessions.csv"

# **Load Dataset**
@st.cache_data
def load_data():
    return pd.read_csv("deepseek_vs_chatgpt.csv")  # Replace with actual dataset

df = load_data()

# **Function to Load User Sessions**
def load_session_data():
    if os.path.exists(SESSION_FILE):
        return pd.read_csv(SESSION_FILE)
    else:
        return pd.DataFrame(columns=["timestamp", "username", "search", "platform"])

# **Function to Save Session Data**
def save_session(username, search, platform):
    session_data = load_session_data()
    new_entry = pd.DataFrame([{"timestamp": datetime.now(), "username": username, "search": search, "platform": platform}])
    session_data = pd.concat([session_data, new_entry], ignore_index=True)
    session_data.to_csv(SESSION_FILE, index=False)

# **Sidebar Navigation**
st.sidebar.title("Navigation")
page = st.sidebar.radio("Go to", ["📊 Admin Dashboard", "🔍 AI Search Dashboard"])

# **Admin Dashboard**
if page == "📊 Admin Dashboard":
    st.title("📊 Admin Dashboard - User Tracking")

    # **Load User Sessions**
    session_data = load_session_data()

    # **Total Logins**
    st.subheader("👤 Total Logins")
    total_logins = session_data["username"].nunique()
    st.metric(label="Total Users Logged In", value=total_logins)

    # **Most Active Users**
    st.subheader("🔥 Most Active Users")
    user_counts = session_data["username"].value_counts().reset_index()
    user_counts.columns = ["User", "Login Count"]
    st.dataframe(user_counts)

    # **Recent Sessions**
    st.subheader("🕒 Recent User Sessions")
    st.dataframe(session_data.tail(10))

    # **User Search & Filtering**
    st.subheader("🔍 Search User Activity")
    search_user = st.text_input("Enter username to filter activity")
    if search_user:
        user_activity = session_data[session_data["username"] == search_user]
        st.dataframe(user_activity)

# **AI Search Dashboard**
else:
    st.title("🔍 AI Search Dashboard")

    st.sidebar.title("🔍 Filter Options")
    selected_platform = st.sidebar.selectbox("Select AI Platform", df["AI_Platform"].unique())

    search_query = st.sidebar.text_input("Search Query Type")
    username = st.sidebar.text_input("Enter your username")  # Manual user input

    if username:
        save_session(username, search_query, selected_platform)

    filtered_df = df[df["AI_Platform"] == selected_platform]
    if search_query:
        filtered_df = filtered_df[filtered_df["Query_Type"].astype(str).str.contains(search_query, case=False, na=False)]

    st.subheader("📋 Filtered Data")
    st.dataframe(filtered_df)
2025-03-15 20:27:15.117 WARNING streamlit.runtime.caching.cache_data_api: No runtime found, using MemoryCacheStorageManager
2025-03-15 20:27:15.120 WARNING streamlit.runtime.caching.cache_data_api: No runtime found, using MemoryCacheStorageManager
2025-03-15 20:27:15.122 WARNING streamlit.runtime.scriptrunner_utils.script_run_context: Thread 'MainThread': missing ScriptRunContext! This warning can be ignored when running in bare mode.
2025-03-15 20:27:15.355 
  Warning: to view this Streamlit app on a browser, run it with the following
  command:

    streamlit run C:\Users\ABHISHEK\anaconda3\Lib\site-packages\ipykernel_launcher.py [ARGUMENTS]
2025-03-15 20:27:15.356 Thread 'MainThread': missing ScriptRunContext! This warning can be ignored when running in bare mode.
2025-03-15 20:27:15.463 Session state does not function when running a script without `streamlit run`
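Streamlit reruns the script from top to bottom on each widget interaction, so the dashboard's `save_session` call runs again on every rerun while the username box is filled in. A possible variant of the two session helpers, sketched here without any Streamlit dependency so it can be tried in a plain Python session, skips a write when it would repeat the most recent entry. The `path` parameter, the duplicate check, and the boolean return value are assumptions added for illustration and are not part of the original app.

```python
import os
import tempfile
from datetime import datetime

import pandas as pd

COLUMNS = ["timestamp", "username", "search", "platform"]

def load_session_data(path):
    """Read the session log, or return an empty frame if none exists yet."""
    if os.path.exists(path):
        # Read everything as text and keep empty searches as "" instead of NaN.
        return pd.read_csv(path, dtype=str, keep_default_na=False)
    return pd.DataFrame(columns=COLUMNS)

def save_session(path, username, search, platform):
    """Append one row unless it repeats the latest entry; return True if written."""
    sessions = load_session_data(path)
    if not sessions.empty:
        last = sessions.iloc[-1]
        if (last["username"], last["search"], last["platform"]) == (username, search, platform):
            return False
    new_entry = pd.DataFrame([{
        "timestamp": datetime.now().isoformat(),
        "username": username,
        "search": search,
        "platform": platform,
    }])
    # Append to the file; write the header only when creating it.
    new_entry.to_csv(path, mode="a", header=not os.path.exists(path), index=False)
    return True

# Simulate three script reruns against a throwaway log file.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "user_sessions.csv")
    first = save_session(log_path, "alice", "code", "ChatGPT")
    repeat = save_session(log_path, "alice", "code", "ChatGPT")  # unchanged rerun
    changed = save_session(log_path, "alice", "", "DeepSeek")
    logged = load_session_data(log_path)

print(first, repeat, changed, len(logged))
```

As the log output above indicates, the dashboard itself needs to be saved as a script and launched with `streamlit run` rather than executed inside a notebook cell.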